PROTEIN ANALYSIS



The present invention relates to methods for analysing mixtures of proteins. In particular, 
the invention relates to methods to compare proteins between different cells and tissues.
The invention involves the combination of digestion or cleavage of protein mixtures, and 
subsequent analysis of mass. The invention also preferably involves the fractionation of 
proteins or peptide fragments. 

Current methods to analyse en masse complex mixtures of proteins such as in mammalian 
cells or tissues require that the proteins are separated by technologies such as two 
dimensional (2D) gel electrophoresis. For this technology, cellular proteins are usually 
separated on the basis of charge in one dimension and on the basis of size in the other 
dimension. Proteins can either be identified with reference to the electrophoresis migration
pattern of a known protein or by elution of the protein from the electrophoretically 
separated spot and analysis by methods such as mass spectrometry and nuclear magnetic 
resonance. However, limitations of the 2D protein gel method include the limited 
resolution and detection of proteins from a cell (typically only 5000 cellular proteins are 
clearly detected), the limitation to identification of separated proteins (for example, mass 
spectrometry usually requires 100 fmoles or more of protein for identification), the
specialist nature of the technique and the difficulty in automating the technique in order to 
achieve very high protein analysis throughputs. There is thus a need for superior methods 
to analyse complex mixtures of proteins en masse especially using methods without gel 
electrophoresis and methods which are easy to automate. 

The core of the present invention is that proteins are either digested or cleaved into 
smaller peptide fragments and then subjected to mass analysis especially by mass 
spectroscopy. Preferably, there will also be one or more protein or peptide fractionation 
steps to limit the complexity of the protein or peptide mixture subjected to mass
analysis, with mass typically measured as a mass-to-charge ratio by mass
spectroscopy. Optionally, proteins or peptide fragments may also be conjugated with a
"chemical tag" to assist in fractionation. 

The major aspect of the invention provides for cleavage of proteins using proteases or 
chemical methods, fractionation of the peptide mixture thereby produced and subsequent 
mass analysis. One preferred method for fractionation of peptides is by using affinity 
reagents such as antibodies or solid phases or reactive chemical groups to isolate specific 
peptides or mixtures of peptides for subsequent mass analysis. Affinity reagents such as 
monoclonal or polyclonal antibody preparations can be used to retrieve individual peptides 
or sets of peptides from the peptide mixture for subsequent mass analysis. Alternatively or 
additionally, affinity reagents can be used to eliminate peptides from the mixture whereby
the mixture itself is subsequently subjected to mass analysis. The affinity reagents can
either bind by virtue of specific sequences or structures in peptides or by virtue of specific 
chemical groups either as natural constituents of the peptides or as chemical tags which 
are added to the peptides either before or after cleavage. 
For analysis of larger mixtures of peptides, panels of mixed antibodies such as those
provided by recombinant libraries of antibody variable region fragments (including single- 
chain antibodies) can be used in order to isolate subsets of peptides for subsequent 
analysis. Such panels of monoclonal antibodies will include a wide range of peptide 
specificities which could be achieved, for example, by pre-absorbing antibody libraries on 
the peptide samples of interest or by immunising animals with peptide samples of interest 
and collecting polyclonal antisera or generating panels of monoclonal antibodies. Then 
individual or mixtures of the selected antibodies are used to isolate (or eliminate) the 
specific subsets of peptides from a test sample. Subsequent mass analysis of a range of 
peptides can facilitate the detection of differences in specific proteins between test 
samples. 

Fractionation of peptides can be achieved using affinity reagents other than antibodies. 
Generation of antibodies to all peptides in a mixture is difficult and is highly dependent on
the number of peptides in a mixture and the facility for individual peptides to be bound 
with reasonable affinity to antibodies ("antigenicity"). With a very large peptide mixture, a 
limitation is redundancy whereby antibodies with the same peptide specificities are 
repeatedly represented whilst antibodies to other peptide specificities are underrepresented 
or absent. This may cause a particular protein not to be mass analysed if none of the
peptides from that protein are bound by an antibody. Therefore, a particularly
useful method is to isolate N or C terminal peptides (or both) from a protein by 
preabsorption of the protein to a solid phase via its N and/or C terminus prior to cleavage 
or by chemical tagging of the N and/or C terminus for subsequent isolation after cleavage. 
In principle, this then should lead to recovery of all N and/or C terminus peptides 
representing all proteins from the sample. Such isolation of N and/or C terminal peptides 
is greatly facilitated by the differential reactive nature of the N terminal amino group and 
the C terminal carboxyl group in the protein compared to internal amino and carboxyl 
groups. As an additional step, such isolated N and/or C terminal peptides can then be 
fractionated further prior to mass analysis using other affinity reagents which either 
recognise specific peptide sequences or which recognise chemical tags on the peptides. 
The invention also allows for sequential conjugation of different chemical tags to the 
protein / peptide mixture especially where N or C termini are sequentially exposed by 
specific cleavage of the protein / peptide and whereby the N or C termini (or both) are 
conjugated with a specific chemical tag upon exposure of that terminus. This aspect of the
invention therefore provides for a series of protein fractions with a range of conjugated 
chemical tags introduced at the termini, such fractions being isolated using an affinity 
reagent which binds to the tag. As an alternative to a chemical tag at the terminus of the 
protein molecule, chemical tags can also specifically be attached to non-terminus amino 
acids such that internal peptides can be isolated via an internal chemical tag. 

In another aspect, the present invention provides for cleavage of proteins using proteases 
or chemical methods and subsequent mass analysis without further fractionation. In this 
case, the analysis of protein mixtures is assisted by sequential cleavage cycles whereby the 
spectrum of proteins and peptides are analysed following each cleavage cycle. This 
method could also include chemical tagging cycles between cleavage cycles to increase the 
mass or steps to remove side-groups such as carbohydrate groups in order to reduce mass.
If the mass of the range of protein fragments is then determined at the end of each 
cleavage cycle (either with or without chemical tagging, cleavage or other modification), 
then a range of mass distributions will be obtained for each cycle. With an appropriate 
series of mass modification cycles, the result for a single protein or a mixture will be a 
mass spectrum of protein/peptide fragments which is altered at successive cycles; the 
pattern of these alterations will provide a "fingerprint" for the specific proteins/peptides in 
the mixture. The appearance and disappearance of a particular protein/peptide fragment 
of a certain mass following specific cleavage cycles, with or without chemical tagging,
cleavage or other modifications will provide a fingerprint for identification of the fragment 
sequence especially by reference to a database of such fingerprints. Comparison of the 
spectrum of protein/peptide fragments from different related samples then allows for the 
identification of protein/peptide fragment differences between these samples. Particularly
useful in this embodiment of the present invention are proteases which specifically recognise
two amino acids and cleave the protein as a result. Examples of such proteases are the
prohormone convertases which cleave between dibasic amino acid pairs.
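
By way of illustration only, the sequential cleavage cycles described above can be sketched in Python. The sketch tracks which fragments appear and disappear after each cycle, which is the kind of pattern that would serve as a "fingerprint". The protein sequence, the function names and the two cleavage rules are hypothetical. In particular, cutting immediately after a dibasic pair and cutting after every arginine are simplifying assumptions, and a real instrument would report fragment masses rather than sequences.

```python
def cleave(peptide, is_site):
    """Split a peptide after every position i for which is_site(peptide, i) is true."""
    fragments, start = [], 0
    for i in range(len(peptide) - 1):  # never cut after the last residue
        if is_site(peptide, i):
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


def after_dibasic(seq, i):
    """Assumed rule: cut after the second residue of a K/R pair (simplified dibasic cutter)."""
    return i >= 1 and seq[i - 1] in "KR" and seq[i] in "KR"


def after_arginine(seq, i):
    """Assumed rule: cut after every arginine (idealised Arg-C)."""
    return seq[i] == "R"


def run_cycles(protein, rules):
    """Apply each cleavage rule in turn and record fragments gained and lost per cycle."""
    pool, history = [protein], []
    for name, rule in rules:
        new_pool = [frag for p in pool for frag in cleave(p, rule)]
        appeared = sorted(set(new_pool) - set(pool))
        disappeared = sorted(set(pool) - set(new_pool))
        history.append((name, appeared, disappeared))
        pool = new_pool
    return pool, history


# Hypothetical example sequence, not taken from any real protein.
final_pool, history = run_cycles(
    "MAGTKRLLSDERAVFKKGWPR",
    [("dibasic", after_dibasic), ("Arg-C", after_arginine)],
)
for cycle_name, appeared, disappeared in history:
    print(cycle_name, "appeared:", appeared, "disappeared:", disappeared)
```

Recording changes per cycle, rather than only the final fragment pool, mirrors the idea that it is the pattern of alterations between cycles which characterises the proteins in the mixture.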

Therefore, the invention provides for novel ways of analysing protein mixtures using a 
combination of protein digestion or cleavage and mass analysis. 

In a related aspect of the present invention, proteins are fractionated prior to cleavage. 
For large protein mixtures, particularly those isolated directly from whole cells or tissues, 
the pre-fractionation of proteins may be desirable in order to reduce the complexity of 
mixtures subjected to subsequent cleavage, peptide fractionation and mass analysis. 
Whilst affinity reagents can be used which recognise sequences or structures in the 
proteins/peptides directly, this will itself require a complex library of affinity reagents such 
as an antibody library and therefore the additional use of chemical tags to provide moieties 
recognised by a set of affinity reagents provides an alternative means of using such 
reagents. More conventional means of pre-fractionation include the use of gel 
electrophoresis either in one or two dimensions where sections of the gel are isolated and 
the proteins within then subjected to cleavage and mass analysis. Other pre-fractionation 
methods include isolation of proteins by virtue of natural modifications such as 
phosphorylation, glycosylation or protein-protein (or peptide) interaction; alternatively,
membrane proteins, or proteins from particular compartments within the cell, can be
pre-fractionated. Another important pre-fractionation procedure is to remove highly
abundant proteins from the mixture using affinity reagents such as antibodies to bind and
remove such proteins. As an alternative to pre-fractionation, peptides generated after
cleavage can also be fractionated by many of these means, as well as by size/charge
fractionation methods using HPLC and by virtue of natural modifications using, for
example, antibodies which bind phosphorylated amino acids within peptides.
Prefractionation of proteins may also be achieved by using affinity reagents such as 
monoclonal/polyclonal antibodies to isolate specific proteins for subsequent cleavage and
mass analysis. For such analysis of larger mixtures of proteins, panels of mixed 
monoclonal antibodies such as those provided by recombinant libraries of antibody 
variable region fragments (including single-chain antibodies) are preferred in order to 
isolate subsets of proteins for subsequent analysis. Such panels of monoclonal antibodies
will include a wide range of protein specificities which could be achieved, for example, by 
pre-absorbing antibody libraries on the mixed protein sample of interest and then using 
individual or mixtures of the selected antibodies in order to isolate subsets of proteins. 
Such analysis provides mass spectra for a range of different protein fractions thus 
facilitating detection of differences in specific proteins between samples. 

A further advantage of the use of chemical tags is that the subsequent fractionation of 
peptides by affinity reagents can greatly reduce the number of selected peptides from a 
protein molecule with the rest of the molecule thus being eliminated from the mass 
analysis. An especially convenient method for selective chemical tagging is to tag either 
(or both of) the N and C terminus of the protein molecules in the mixture and then to 
digest or cleave the protein molecules with a reasonably selective reagent such as an amino
acid- or sequence-specific protease (such as endopeptidase Arg-C) or cleavage reagent
(such as acid pH to cleave at Asp-Pro). Using an affinity reagent, N or C terminal 
peptides (or both) from the original protein could then be isolated and all internal peptides 
discarded. This reduction in complexity is then sufficient for mass analysis especially using 
HPLC coupled to a tandem mass spectrometer to analyse the peptides en masse in order 
to identify the individual peptides from the mixture. 
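
By way of illustration only, the reduction in complexity obtained by retaining only terminal peptides can be sketched in Python. The sketch assumes that every protein carries a tag at its N and C terminus before cleavage, so that after digestion the first and last fragments are the ones an affinity reagent would retain. The sequences and function names are hypothetical, and cleavage after every arginine is an idealised model of Arg-C.

```python
def arg_c_digest(protein):
    """Cleave on the C-terminal side of every arginine (idealised Arg-C)."""
    fragments, start = [], 0
    for i, residue in enumerate(protein):
        if residue == "R" and i < len(protein) - 1:
            fragments.append(protein[start:i + 1])
            start = i + 1
    fragments.append(protein[start:])
    return fragments


def terminal_peptides(protein, keep_n=True, keep_c=True):
    """Return the fragments that would carry the original N and/or C terminal tag."""
    fragments = arg_c_digest(protein)
    selected = []
    if keep_n:
        selected.append(fragments[0])
    # Avoid reporting the same fragment twice when the protein was not cut at all.
    if keep_c and (len(fragments) > 1 or not keep_n):
        selected.append(fragments[-1])
    return selected


# Hypothetical example sequences.
proteins = ["MKTAYRAGLVRSPEQRLLK", "MSDNGRFFAAR"]
all_peptides = [frag for p in proteins for frag in arg_c_digest(p)]
n_terminal_only = [frag for p in proteins for frag in terminal_peptides(p, keep_c=False)]
print(len(all_peptides), "peptides in total,", len(n_terminal_only), "N terminal peptides retained")
```

With only N terminal peptides retained, the number of peptides equals the number of proteins regardless of how many internal cleavage sites each protein contains.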

Alternatively, chemical tagging could be performed only after digestion/cleavage, for 
example with the dibasic cutters, the prohormone convertases. This would provide for 
tagging only at one or more internal sites of the original proteins. If the protein mixture is 
then subjected to a second digestion/cleavage step with a different enzyme or cleaving 
reagent, then the size of the tagged peptides would be reduced where a cleavage site was 
present in the original protein. The tagged peptides could then be fractionated using an 
affinity reagent and subjected to mass analysis. 

In another aspect of the current invention, a protein mixture is subjected to cycles of 
tagging, digestion/cleavage and mass analysis, whereby mass analysis is performed only on 
an aliquot of the mixture resultant from use of an affinity reagent binding to the specific 
chemical tag and whereby the master mixture is then subjected to tagging with a different 
chemical tag and digestion/cleavage. This provides sequentially a range of different 
fragments for mass analysis. Another variation on the method involves the same initial 
steps as above but, having exposed new N and C termini after cleavage, one (or both) of 
these new termini can then optionally be tagged with a different chemical which thus tags 
internal sites in the original protein. If required, the process could be repeated one or 
more times with a different protease or cleavage reagent, each time with the addition to 
the N or C terminus of a different chemical tag. In one format of the method, the whole 
mixture of proteins would first be tagged with two different chemical groups at each of the 
N and C terminus and then cleaved with a protease, such as one which specifically cuts 
adjacent to a specific amino acid, and tagged again at the new N and C termini with two 
further different chemical groups. This would result in a mixture of peptides each with 
chemical tags at the termini. As the N and C terminal peptides would have a specific tag, 
these could then be isolated from the mixture using appropriate affinity reagents. Internal 
peptides without either the initial N or C terminal tags could be isolated using their
specific tags. The process of digestion and tagging could then be repeated to create 
further peptides with tags. Using specific combinations of affinity reagents for specific 
tags, N or C terminal or specific internal peptides from the original protein could then be 
isolated and selected peptides discarded to achieve a reduction in complexity sufficient for 
mass analysis. Where chemical tags are added to two or more amino acid side groups 
within peptides, sequential use of affinity tags could isolate fractions of peptides 
containing specific combinations of amino acids. For example, if a mixture of peptides of
average length of 20 amino acids is separately tagged at lysine and phenylalanine, and the
mixture comprises 25% of peptides which include neither lysine nor phenylalanine, 25% with
lysine only, 25% with phenylalanine only and 25% with both, then the separate or sequential
use of specific affinity reagents for either lysine or phenylalanine will result in fractionation
of peptides into four equal fractions.
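
By way of illustration only, the sequential use of two affinity reagents in the lysine/phenylalanine example can be sketched in Python. The sketch assumes that a peptide binds a given reagent if and only if it contains the corresponding tagged residue. The peptide sequences and function names are hypothetical.

```python
def split_by_tag(peptides, residue):
    """Model one affinity step: peptides containing the tagged residue bind, the rest flow through."""
    bound = [p for p in peptides if residue in p]
    unbound = [p for p in peptides if residue not in p]
    return bound, unbound


def four_fractions(peptides):
    """Apply the lysine-tag reagent first, then the phenylalanine-tag reagent to each half."""
    lys_bound, lys_unbound = split_by_tag(peptides, "K")
    both, lys_only = split_by_tag(lys_bound, "F")
    phe_only, neither = split_by_tag(lys_unbound, "F")
    return {
        "neither": neither,
        "lysine only": lys_only,
        "phenylalanine only": phe_only,
        "both": both,
    }


# Hypothetical short peptides, one for each of the four classes.
fractions = four_fractions(["AGSTV", "AKSTV", "AFSTV", "AKFTV"])
for name, members in fractions.items():
    print(name, members)
```

Each additional independently tagged residue would double the number of possible fractions, which is why combinations of tags are attractive for reducing complexity.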

Where analysis of complex protein mixtures is required such as in mammalian cells or 
tissues, the present invention provides a main method where proteins are fractionated 
either before or after cleavage and the peptides are then mass analysed. The fractionation 
of a complex mixture of proteins or peptides either requires a correspondingly complex 
mixture of affinity reagents or one or more affinity reagents which can recognise features 
of the proteins/peptides which are the basis for fractionation. Where cleavage is 
conducted prior to fractionation, the most common method used in the present invention 
is to cleave the whole protein mixture with a protease such as trypsin or V8 (Glu-C) 
protease and to then selectively isolate and mass analyse certain peptides. Commonly, N 
or C terminal peptides (or both) from the peptide mixture are isolated typically by adding a 
chemical tag to the N and/or C terminus of the proteins prior to cleavage and using an 
affinity reagent which isolates peptides with the chemical tag. Alternatively, specific 
peptides (N / C terminal or otherwise) can be isolated using affinity reagents which have 
been selected for binding to specific peptides within specific proteins; these will then select 
out those peptides from the mixture for subsequent mass analysis. Selective isolation of 
peptides then allows for comparative analysis of specific peptides derived from alternative 
protein mixtures for their relative quantities (relating to relative levels of the proteins in 
their respective mixtures) and, in certain cases, for modifications of the peptides. 
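
By way of illustration only, the comparative analysis of selected peptides from two protein mixtures can be sketched in Python. The sketch assumes that a signal value is available for each isolated peptide in each sample and that a two-fold difference is taken as noteworthy. The peptide names, signal values, threshold and function name are all hypothetical, and signal intensity is only an approximate proxy for quantity.

```python
def compare_samples(sample_a, sample_b, fold=2.0):
    """Classify each peptide by its relative signal in two samples (hypothetical fold threshold)."""
    report = {}
    for peptide in sorted(set(sample_a) | set(sample_b)):
        a = sample_a.get(peptide, 0.0)
        b = sample_b.get(peptide, 0.0)
        if a == 0.0 and b == 0.0:
            continue  # no signal in either sample, nothing to compare
        if a == 0.0:
            report[peptide] = "only in B"
        elif b == 0.0:
            report[peptide] = "only in A"
        elif b / a >= fold:
            report[peptide] = "higher in B"
        elif a / b >= fold:
            report[peptide] = "higher in A"
        else:
            report[peptide] = "similar"
    return report


# Hypothetical signal values for isolated N terminal peptides from two cell samples.
sample_a = {"MKTAYR": 1000.0, "MSDNGR": 400.0, "MAGTKR": 250.0}
sample_b = {"MKTAYR": 1100.0, "MSDNGR": 1600.0, "MLLPER": 300.0}
report = compare_samples(sample_a, sample_b)
for peptide, verdict in report.items():
    print(peptide, verdict)
```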

For isolation of N or C terminal peptides, the preparation and use of affinity reagents is 
one important aspect of the present invention and the labelling of the N or C terminus of 
proteins is another important aspect. With a typical mixture of proteins from mammalian 
cells or tissues or from many living organisms, several of the N termini of these proteins 
(and some C termini) will be modified (for example, by methylation) such that addition of 
a chemical tag to the terminus may be blocked. In addition, in a typical mixture of proteins
from mammalian cells or tissues or from many living organisms, the proteins will occur at
different relative levels of abundance including, commonly, certain highly abundant
proteins. Where protein mixtures from mammalian cells or tissues or from other living
organisms are used for the initial selection of affinity reagents, such highly abundant 
proteins may dominate selection of affinity reagents and may be predominant in the final 
peptide mixture for mass analysis. A solution to both of these problems is to use an 
artificial source of mixed proteins to isolate the affinity reagents. Typically, this will be a
gene expression system whereby a gene (usually cDNA) library is used to generate the 
proteins without N or C terminal modifications. In addition, the use of a gene expression
system allows the gene library to be "normalised" to reduce or remove highly abundant 
genes within the library. This is typically achieved by self-annealing of the DNA (or RNA) 
prior to constructing the library. Therefore, a common method in the present invention is 
to generate proteins by expression of gene libraries (usually normalised) resulting in 
proteins free from significant N or C terminal modifications and, where normalised, 
resulting in a protein mixture free from domination by specific proteins. A typical 
expression system used with gene libraries is in vitro transcription and translation using a 
eukaryotic ribosome preparation; this also provides the possibility of incorporating 
modified amino acids into the expressed proteins. The expressed protein mixture can then 
be used directly for N or C terminal labelling. Other expression systems could also be 
used where N terminal amino groups or C terminal carboxyl groups are not modified or 
prevented from subsequent chemical tagging. Where modification occurs, in some cases 
the N terminal modification can be removed either using enzymes such as histone 
deacetylase or chemical methods such as limited cyanogen bromide cleavage to remove N 
terminal methionines. Having produced a mixture of proteins free from N/C terminal 
modification, chemical tags can then be added to the N/C terminal amino group(s). For 
the N terminus, the ε-amino group of lysines can be initially blocked using reagents such
as citraconic anhydride or methyl acetimidate to then allow only the N terminal amino
groups to react. Alternatively, the ε-amino group of lysines can be blocked by
incorporating modified lysines into the expression system such as in vitro transcription/
translation whereby, for example, biotin-modified lysines can be directly incorporated 
instead of lysines. Chemical tags can then be added selectively to the N terminus of 
proteins, for example using isothiocyanates of specific molecules to which an affinity 
reagent is available. One such example is fluorescein which is incorporated by reaction of 
the proteins with fluorescein isothiocyanate allowing subsequent purification with anti- 
fluorescein antibodies. Alternatively, polycarboxylic chelating agents can be incorporated 
as isothiocyanates allowing subsequent purification with specific metals. Once the N and/or
C termini of proteins in the mixture are tagged, the protein is then comprehensively and 
specifically cleaved either chemically or enzymatically, using proteases such as trypsin or 
another cleaving agent. Such cleavage thereby releases from each protein an individual 
tagged terminal peptide fragment, and the collection of such fragments can then be purified
from the mixture of untagged peptides using an appropriate affinity reagent such as an 
antibody specific for the chemical tag. If required, the size of the chemical tag can be 
increased in order to produce a larger mass for analysis; this would be useful for peptide 
fragments resulting from cleavage very close to the chemical tag whereby the resultant 
fragment might be so small as to be mass analysed within lower molecular weight "noise". 
The chemical tag might, for example, comprise a piece of nucleic acid attached to the 
peptide via a reactive group introduced during synthesis of the nucleic acid. Such a 
nucleic acid molecule might also be useful for isolation of the tagged peptide via annealing 
of the nucleic acid to a complementary sequence.
Following chemical tagging and isolation, the recovered mixture of N/C terminal peptides
is then used as a "bait" for the isolation of affinity reagents to bind to these same peptides
from proteins derived directly from mammalian cells or tissues or from other living 
organisms. Such affinity reagents will typically derive from a library of single chain 
antibodies displayed as part of a particle containing the corresponding gene encoding the 
antibody. Examples of such particles are ribosome display particles or phage display 
particles, in each case where the genes from selected antibodies can be rescued in order to 
propagate those specific antibodies. As an alternative, large arrays of antibodies (such as 
recombinant single chain or Fabs, Fvs) can be screened using the N/C terminal peptide 
mixture and antibodies which display binding to the peptides can be recovered via the 
corresponding genes. As another alternative, N and/or C terminal peptides could be used 
to directly generate polyclonal or monoclonal antibodies by appropriate immunisation of 
an animal. By these means, a mixture of affinity reagents is selected which can then be 
used for the analysis of mixtures of proteins such as from mammalian cells or tissues or 
from other living organisms. Such analysis can either involve using the mixture of affinity 
reagents to select out N/C terminal peptides from proteins derived from mammalian cells 
or tissues or from other living organisms or using individual affinity reagents to select out 
individual peptides. The selected peptides can then be mass analysed typically by MALDI-ToF
(matrix-assisted laser desorption/ionisation time-of-flight) where the individual
peptides give individual mass:charge ratios which can then be used to identify the peptide
amino acid constituents. MS-MS (tandem mass spectroscopy) peptide sequencing can
subsequently be used to identify the peptide if it can be isolated. Alternatively, the new
generation of Quadrupole-ToF LC-MS-MS ("Q-ToF") instruments can provide for
sequential MALDI-ToF and MS-MS within the same instrument. Indeed, affinity reagents
either individually or in mixtures can be immobilised either indirectly or directly onto the 
desorption chip inserted into the MALDI-ToF instrument and peptides can be 
subsequently bound via the affinity reagents on the chip. In this way, multiple peptide 
fractions adsorbed by multiple affinity reagents at different loci can be analysed on a single 
chip. The use of recombinant proteins as the "bait" to isolate affinity reagents also 
provides the prospect of attaching other tags to those proteins whereby the tags are 
encoded by the gene sequence; for example, a C terminal polyhistidine tag (allowing 
subsequent purification of the tagged fragments using nickel chelates) could be 
incorporated, for example through PCR-mediated incorporation into the gene sequences. 
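
By way of illustration only, the mass:charge ratio that a peptide of known sequence would be expected to give can be sketched in Python. The sketch assumes an unmodified, untagged peptide ionised by addition of protons, and uses standard monoisotopic residue masses. The function names are hypothetical, and any chemical tag would add its own mass to the result.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide for the free N and C termini
PROTON = 1.007276  # mass of each charge-carrying proton


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mass_to_charge(sequence, charge=1):
    """Expected m/z of the peptide carrying the given number of protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


# Hypothetical N terminal peptide, singly and doubly protonated.
for z in (1, 2):
    print("MKTAYR", z, round(mass_to_charge("MKTAYR", z), 4))
```

Because leucine and isoleucine have identical masses, a mass:charge ratio alone constrains the amino acid composition of a peptide rather than fixing its sequence, which is why MS-MS sequencing is mentioned as a subsequent step.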

The use of recombinant proteins as the "bait" to isolate affinity reagents also provides 
another common method of the present invention for specifically isolating peptides using 
tags encoded by the recombinant proteins. Such tags can be conveniently incorporated 
into members of a gene (usually cDNA) library during its construction or into
individual clones or groups of clones thereof using specific PCR primers encoding such 
tags and designed to incorporate such tags into the resultant expressed proteins. 
Preferably, such tags will be incorporated into the expressed proteins in all reading frames 
in order to produce a productively tagged protein. Such tags will preferably be 
incorporated via the downstream primer of a PCR reaction with the usual result that the 
tag is produced towards the C terminal end of the expressed protein (although upstream 
termination codons may prevent this in some clones). However, tags may also be
incorporated at the N terminal end or in both N and C termini. 

For the isolation of specific peptides from a peptide mixture, the peptide sequences can be 
produced synthetically (or via recombinant DNA) and then, as above, used as the "bait" to 
capture specific affinity reagents. These affinity reagents can then be used to isolate these 
same peptides from a cleaved protein mixture derived from, for example, mammalian cells 
or tissues or from other living organisms. 

As an alternative to selectively fractionating N or C terminal peptides or specific internal
peptides, modified peptides can be isolated, such as peptides including phosphorylated amino
acids, which can be isolated using antibodies which selectively bind to phosphorylated amino acids
(tyrosine, threonine or serine or combinations thereof) or using immobilised Fe3+ to trap
negatively charged peptides. Similarly, peptides modified by glycosylation and other
modifications can be isolated, in some cases where the peptide modification is further 
derivatised in order to facilitate isolation. For example, carbohydrates can readily be 
modified via periodate reactions as an intermediate to adding chemical tags such as 
fluorescein. 

Mass analysis of proteins and peptides by the present invention is preferably performed 
using mass spectroscopy. In particular, MALDI-ToF analysis has the capability to very 
accurately measure specific mass:charge ratios for individual peptides. This method has
the capability for simultaneous analysis of thousands of peptides. Above 4 kD, the
resolution of individual peptides (and proteins) becomes poorer such that cleavage of 
proteins into peptide fragments is necessary in order to provide fine resolution. Recent 
methods of interfacing liquid chromatography separation methods (such as HPLC) with 
tandem mass spectroscopy have already permitted the mass spectrum analysis of protein
mixtures comprising up to 200 proteins. As such proteins are analysed following protease 
digestion, if an average ten peptides per protein is assumed, then the method can analyse 
up to 2000 peptides. Using methods of the present invention whereby, for example, only 
tagged N terminal peptides are analysed, then up to 2000 N terminal peptides derived 
from up to 2000 proteins could be analysed at any one time. As this capacity is not
sufficient for an en masse analysis of mammalian proteins from cells (typically 50,000 per
cell), peptides have to be segregated into at least 25 fractions in order for all of these
fractions to be analysed. Such further fractionation can be achieved by the direct use of
affinity reagents to label internal ends after successive protein digestion/cleavage steps 
following which specific affinity reagents are used to fractionate peptides according to 
their tags. As an alternative to standard mass spectroscopy, MALDI-ToF can be used to 
produce protein mass profiles which can be compared for protein mixtures from different 
cells. 
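
By way of illustration only, the capacity arithmetic in the preceding paragraph can be written out in Python. The figures (200 proteins per run, an assumed average of ten peptides per protein and 50,000 proteins per cell) are those given in the text; the function names are hypothetical.

```python
import math


def peptides_per_run(proteins_per_run=200, peptides_per_protein=10):
    """Peptide capacity of one LC/tandem MS run, from the figures quoted in the text."""
    return proteins_per_run * peptides_per_protein


def fractions_needed(total_proteins, capacity):
    """Minimum number of fractions when one terminal peptide per protein is analysed."""
    return math.ceil(total_proteins / capacity)


capacity = peptides_per_run()                # 200 proteins x 10 peptides each
fractions = fractions_needed(50_000, capacity)
print(capacity, "peptides per run;", fractions, "fractions needed")
```

The calculation assumes that peptides are distributed evenly between fractions; uneven fractions would require more than this minimum.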

Chemical tags are typically moieties which can be covalently attached to proteins usually 
at the N or C terminus. For chemical tagging of the N terminus, this is commonly 
undertaken at the terminal amine group. If it is necessary to avoid tagging of the ε-amino
group of lysines, then these can be initially blocked using reagents such as citraconic 
anhydride or methyl acetimidate. Terminal amine groups are then reactive with a wide
range of chemical reagents, especially isothiocyanates. Thereby, common antibody-recognised
ligands such as dinitrophenol and fluorescein can then be attached to the N
terminus for subsequent fractionation using an antibody affinity reagent. For chemical
tagging of the C terminus, methods based on carbodiimide activation are commonly used 
to introduce ligands which are bound by affinity reagents. Alternatively, addition of 
moieties to the C terminus of proteins has been described using reverse proteolysis 
whereby certain proteases such as carboxypeptidase Y and lysyl endopeptidase can work 
in reverse to add chemical tags, commonly by way of amino acids either as derivatised 
amino acids with tags for binding to an affinity reagent or by way of natural sequences of 
amino acids which can then be specifically bound by an affinity reagent. It will be 
recognised that a wide range of internal amino acids can also be chemically tagged 
including Lys via the ε-amino group, Glu / Asp via the carboxyl group, Cys via the thiol
group, Ser / Thr via the hydroxyl group and Tyr via the hydroxyphenyl group. Specific 
derivatisations of most other amino acids have been described. It will also be recognised 
that post-translational protein modifications can be used for addition of chemical tags
especially with glycosylation where the sugar residues are commonly oxidised by periodate
to form aldehyde groups which can then react with amine-containing molecules. Other
modifications which can be used to add chemical tags include lipidation, phosphorylation
and metal ion addition. It will be recognised that there are a large number of methods in 
the art for introducing one or more chemical tags at specific sites within protein molecules 
or peptides. 

Affinity reagents for use in the present invention are commonly monoclonal antibodies. 
For specific sequences or structures within proteins or peptides, a library of recombinant 
antibody binding sites usually in the form of Fabs, Fvs or single-chain Fvs is used where
commonly the antibody binding sites are "displayed" using, for example, bacteriophage or 
ribosome complexes such that the gene encoding individual antibody binding sites can be 
recovered. For use in the present invention, libraries of antibody binding sites can be 
dispersed into groups, for example by picking and arraying phage plaques or picking and 
arraying genes in vectors for ribosome display. Such pools will usually contain antibody 
binding sites for several proteins or peptides such that the pools can be used for 
fractionation. Alternatively, the protein or peptide mixture to which libraries of antibody 
affinity reagents are required can be immobilised and used as the target for the pre- 
selection of suitable affinity reagents which are then dispersed into pools or used as 
individual reagents. For chemical tags, individual monoclonal antibodies are used to 
specifically bind to individual tags in order to achieve subsequent fractionation. 

The present invention includes the use of affinity reagents other than monoclonal 
antibodies where such reagents can facilitate the fractionation of peptides or proteins prior 
to mass analysis. Such affinity reagents would include molecules of the immune system which 
selectively bind certain peptides such as major histocompatibility proteins and T cell 
receptors. Other affinity reagents would include protein domains commonly involved in 
protein-protein binding interactions such as SHI domains. 



Affinity reagents are an important aspect of the present invention and can be used for both 
broad fractionation of groups of proteins/peptides or for specific fractionation of 
individual proteins/peptides. For fractionation, it is first necessary to prepare fractions of, 
or individual, affinity reagents which bind to a specific fraction or specific peptide and not 
to other fractions/peptides. A convenient method is to fractionate the proteins or peptides 
prior to isolation of the affinity reagents. In the case of antibodies as the affinity reagents, 
such proteins/peptides can then be used either to bind displayed antibodies from a library 
or can be used to immunise animals for generation of antisera. Where a library of 
recombinant antibody binding sites such as single-chain Fv's is used, gene clones encoding 
these can be retrieved after binding to protein/peptide fractions providing a replicable 
source of the affinity reagents for subsequent isolation of the specific protein/peptide 
fraction. Individual single-chain Fv's may, in parallel, be screened for binding specificity, 
for example by analysing peptide binding by MALDI-ToF. In this case, single-chain Fv's 
which bind to a single peptide from a large protein mixture are retained (in practice, those 
binding up to three peptides are also retained) as gene clones for subsequent individual use 
or use within a mixture of Fv's for isolation of a protein/peptide fraction from the mixture. 
It will be appreciated that free N termini from proteins are often good targets for isolation 
of very specific antibodies and therefore capture and release of N terminal peptides from a 
protein will particularly favour subsequent antibody isolation. Certain Fv's may be useful 
for the elimination of abundant proteins or peptides from the mixture. It will be 
appreciated that retention and characterisation of the binding of single-chain Fv's may also 
provide a means to reduce redundancy by eliminating Fv's with the same specificity as 
other Fv's. 
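
The retention rule described above (keep Fv's that bind a single peptide, tolerate up to three, and discard Fv's whose specificity duplicates one already retained) can be sketched as a simple filter. This is an illustrative sketch only: the clone names, peptide labels and the `select_clones` helper are hypothetical and are not part of the described method, and real binding data would come from the MALDI-ToF screen.

```python
from typing import Dict, FrozenSet, List


def select_clones(binding: Dict[str, FrozenSet[str]],
                  max_peptides: int = 3) -> List[str]:
    """Return clones binding 1..max_peptides peptides, dropping any clone
    whose set of bound peptides duplicates that of a clone already kept."""
    kept: List[str] = []
    seen_specificities = set()
    for clone in sorted(binding):          # sorted for a deterministic result
        peptides = binding[clone]
        if not 1 <= len(peptides) <= max_peptides:
            continue                       # binds nothing, or too promiscuous
        if peptides in seen_specificities:
            continue                       # redundant specificity
        seen_specificities.add(peptides)
        kept.append(clone)
    return kept


# Hypothetical screening result: clone -> peptides observed bound
binding = {
    "scFv-01": frozenset({"pepA"}),
    "scFv-02": frozenset({"pepA"}),                          # same as scFv-01
    "scFv-03": frozenset({"pepB", "pepC"}),
    "scFv-04": frozenset({"pepA", "pepB", "pepC", "pepD"}),  # more than three
    "scFv-05": frozenset(),                                  # no binding
}
print(select_clones(binding))
```

Treating two clones as redundant only when their bound peptide sets are identical is an assumption made for the sketch; a looser criterion such as overlapping sets could equally be used.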

The various aspects of the invention cover combinations of protein digestion/cleavage and 
mass analysis with a preferable step of fractionation using affinity tags for specific 
sequences or structures in the proteins or peptides, and an optional step of chemical 
tagging with fractionation by virtue of these tags. The different aspects encompass 
different sequences of these steps as follows: 

1 - repeated digestion/cleavage cycles and mass analysis 

2 - digestion/cleavage, fractionation with affinity reagents, mass analysis 

3 - fractionation with affinity reagents, digestion/cleavage, mass analysis 

4 - terminal chemical tagging, digestion/cleavage, fractionation with tag affinity reagents, 
mass analysis 

5 - as 3 but with additional cycle(s) of tagging, digestion/cleavage, fractionation 

6 - as 4 but with repeated tagging, digestion/cleavage cycles and mass analysis 

The current invention should be considered to encompass these and related 
protein/peptide processing steps with the core objective of reducing the complexity of 
protein mixtures in order to achieve mass analysis of the resultant protein/peptide 
fractions. 
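
As a rough illustration of how sequence 4 above reduces complexity, the sketch below digests a few made-up protein sequences in silico and compares the total number of peptides with the number retained when only the peptide carrying the N terminal tag is kept. The sequences, the simplified cleavage rule (cut after every Lys or Arg) and the function names are assumptions for illustration, not data or procedures from the examples.

```python
from typing import List


def cleave(seq: str, residues: str = "KR") -> List[str]:
    """Cut after every residue listed in `residues` (simplified protease rule)."""
    pieces, start = [], 0
    for i, aa in enumerate(seq):
        if aa in residues:
            pieces.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        pieces.append(seq[start:])         # C terminal remainder
    return pieces


def n_terminal_fraction(proteins: List[str], residues: str = "KR") -> List[str]:
    """Model of sequence 4: after tagging the N terminus and digesting,
    only the first (tag-bearing) peptide of each protein is retained."""
    return [cleave(p, residues)[0] for p in proteins if p]


# Hypothetical protein sequences
proteins = ["MAKTLLRGSEK", "MSDRPLKAAGW", "MTTEKR"]
all_peptides = [pep for p in proteins for pep in cleave(p)]
tagged = n_terminal_fraction(proteins)
print(len(all_peptides), "peptides in the full digest;",
      len(tagged), "after N terminal capture")
```

In this toy case each protein contributes exactly one peptide to the mass analysis, whatever its length.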

The currently preferred operation of the invention involves tagging the N and/or C 
terminus of a mixture of proteins (either natural or encoded by cDNA libraries), cleaving 
with a protease, immobilising the N and/or C terminal fragments, using these to bind 
antibodies and then retaining these antibodies as affinity reagents. The mixture of proteins 
may be pre-fractionated, for example by size, or may be produced from cDNA libraries 
which are pre-fractionated by segregation of clones. The retained affinity reagents are 
then used to analyse complex samples of proteins whereby the antibodies are used to bind 
peptides which are then mass analysed by MALDI-ToF. 

It will be appreciated that many of the same principles described herein for the 
digestion/cleavage, fractionation and mass analysis of proteins can also be applied to other 
polymeric molecules such as DNA or RNA. In the case of DNA or RNA, free phosphate 
and hydroxyl groups at the 5' and 3' termini respectively provide a means for very specific 
addition of chemical tags or direct binding to a solid phase. Sequence specific restriction 
or modification enzymes provide for cleavage or modification of DNA molecules. Useful 
affinity reagents for DNA or RNA are nucleic acids themselves which can be specifically 
hybridised to a complementary DNA or RNA sequence with attachment to a solid phase 
either before or after hybridisation. Using such methods, complex mixtures of nucleic 
acids can be fractionated and then subjected to mass analysis especially using mass 
spectrometry. 
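
The use of a nucleic acid as its own affinity reagent can be pictured with a short sketch in which a capture probe retains only those fragments containing the sequence complementary to it. The sketch assumes exact, full-length matching on DNA written 5' to 3' and ignores hybridisation conditions; the probe, fragments and function names are hypothetical.

```python
from typing import List

# Translation table for Watson-Crick base pairing (DNA only)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def capture(fragments: List[str], probe: str) -> List[str]:
    """Return the fragments containing the site to which `probe` would
    hybridise (exact match only; mismatches are not modelled)."""
    target = reverse_complement(probe)
    return [f for f in fragments if target in f]


# Hypothetical fragments and immobilised probe
fragments = ["GGATGCAA", "TTTTCCCC", "CATGCATT"]
print(capture(fragments, "GCAT"))
```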

The invention is illustrated by the following examples which should not be considered as 
limiting in scope: 

Example 1 

In this example, human p53 protein was modified with a chemical tag at its N terminus, 
cleaved with a protease, the chemically tagged peptide then recovered using a tag-specific 
monoclonal antibody and the peptide then analysed by MALDI-ToF. p53 protein was a 
gift from Dr Borek Vojisek (University of Brno, Czech Republic). 100ug of p53 protein 
was treated with the succinimide ester of (methyl sulphonyl) ethyl carbonate according to 
Mikolajczyk et al., Bioconjugate Chem., vol 7 (1996) p150-158 in order to block lysine side-chains. 
The blocked protein was dissolved at 1mg/ml in 0.1M sodium bicarbonate buffer pH8.5 
and NHS-SS-biotin (Pierce, Chester, UK) was added to 100ug/ml final. The reaction was 
carried out for 6 hours at room temperature and terminated with ethanolamine. The 
protein mixture was then passed down a Sephadex G25 column (Pharmacia, Milton 
Keynes, UK) in PBS and the void volume collected using A280 measurements of the 
eluates. 40ul of eluate containing 2ug p53 was then heat denatured (95°C for 5 mins), 
cooled to 37°C and 1ug endoproteinase Arg-C (from C. histolyticum, Calbiochem, 
Nottingham, UK) was added and the mixture incubated at 37°C for 1 hour. Then 10ul of 
streptavidin-agarose (Sigma, Poole, UK) in PBS was added and the mixture shaken for 10 
minutes. The agarose was pelleted at 16000g for 1 min and washed three times in TSO 
buffer (75mM Tris.HCl, 200mM NaCl, 0.5% N-octyl glucoside, pH8) and three times in 
TSMK (10mM Tris.HCl, 200mM NaCl, 5mM 2-mercaptoethanol, pH8). Finally, 10ul of 
a saturated solution of alpha-cyano-4-hydroxycinnamic acid in 1% aqueous trifluoroacetic 
acid/acetonitrile (1:1 v/v) was added to the washed beads and 1ul of this was loaded onto 
the mass spectrometer chip. The analysis was carried out using a Perseptive Biosystems 
Voyager-DE STR Biospectrometry Workstation (Perseptive Biosystems). The mass 
spectra were collected by adding spectra from 200 laser shots. 

The results showed a major peak corresponding to the 65 amino acid N terminal Arg-C 
endoprotease fragment with no significant levels of other p53 Arg-C peaks. 
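
The expected position of such a peak can be estimated by digesting the sequence in silico and summing residue masses. The sketch below is a minimal illustration that assumes Arg-C cleaves after every arginine (real enzyme specificity is less strict), uses standard monoisotopic residue masses, and ignores the mass of any residual tag or blocking group. The toy sequence is hypothetical and is not the p53 sequence.

```python
from typing import List

# Standard monoisotopic residue masses (Da)
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini


def argc_digest(seq: str) -> List[str]:
    """Cleave after every Arg (simplified Arg-C rule, assumed for this sketch)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        if aa == "R":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


toy = "MSTPAREGLKRWDV"               # hypothetical sequence
peptides = argc_digest(toy)
n_terminal = peptides[0]             # the fragment a tag on the N terminus would capture
print(n_terminal, round(peptide_mass(n_terminal), 4))
```

For a tagged fragment the mass of whatever remains of the tag after elution would need to be added to the value computed here.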

Example 2 

The method of example 1 was repeated except that the N terminal biotin-tagged peptide 
was used to isolate a single-chain Fv antibody fragment from a phage display library of 
single-chain Fv's. Subsequently, the single-chain Fv was used to isolate the N-terminal 
peptide fragment from a protease digest of the test protein as confirmed by MALDI-ToF. 
An extract of normal human brain, prepared as in example 4, was conjugated to KLH 
according to Harlow and Lane, "Antibodies" (1988) (Cold Spring Harbor Publications) 
and used to immunise two BalbC mice. 2 doses were given intra-peritoneally with an 
interval of 4 weeks between them. 3 to 4 days after the 2nd inoculation, the mice were 
sacrificed and spleens removed by dissection. Spleen mRNA preparation was then 
initiated using QuickPrep™ mRNA purification kit (Pharmacia) according to the 
manufacturer's instructions. 

The Pharmacia Recombinant Phage Antibody System (Pharmacia) was used to produce a 
library of mouse single chain Fvs (ScFv). First-strand cDNA was generated from the 
mRNA using M-MuLV reverse transcriptase and random hexamer primers. Antibody 
heavy and light chain genes were then amplified using specific heavy and light chain 
primers complementary to conserved sequences flanking the antibody variable domains. 
The 340 and 325 base pair products generated for heavy and light chain DNA respectively 
were separately purified following agarose gel electrophoresis. These were then 
assembled into a single ScFv construct using a DNA linker-primer mix to give the VH 
region joined by a (Gly4Ser)3 peptide to the VL region. The assembled ScFv were 
amplified with primers designed to insert SfiI and NotI sites at the 5' and 3' ends 
respectively, giving an 800 bp product. This fragment was purified, sequentially digested 
with SfiI and NotI, and repurified. The fragment was then ligated into SfiI and NotI cut 
pCANTAB 5 phagemid vector. pCANTAB 5 contains the gene encoding the phage gene 
3 protein (g3p) and the ScFv is inserted adjacent to the g3 signal sequence such that it will 
be expressed as a g3p fusion protein. Competent E. coli TG1 cells were transformed with 
the pCANTAB 5/ScFv phagemid then subsequently infected with the M13KO7 helper phage. 
The resulting recombinant phage contained DNA encoding the ScFv genes and displayed 
one or more copies of recombinant antibody as fusion proteins at their tips. 

Phage-displayed ScFv that bind to the tagged peptide were then selected or enriched by panning. Briefly, 
the biotinylated and protease treated p53 preparation from example 1 was applied to a 
streptavidin-coated glass slide (Radius Biosciences, Waltham, USA) and the slide was 
washed four times in PBS. After blocking with 2% non-fat dry milk in PBS, the phage 
preparation was applied and incubated for 1 hour. After washing 10 times with 
TBS/0.05% Tween 20, peptide reactive recombinant phage were detected with 
horseradish peroxidase conjugated anti-M13 antibody and revealed with o-phenylene diamine 
chromogenic substrate. These phage were subsequently eluted with 0.1M glycine.HCl 
pH2.2 and 1mg/ml BSA and neutralised with 2M Tris base. The eluted phage were 
amplified in JM103 grown in 25ml J broth. Two additional rounds of panning were 
undertaken and finally 10 single plaques were isolated, pooled and further amplified. An 
aliquot of 10^10 amplified phage was incubated for 2 hours at 4°C with 0.1ug of biotinylated 
and endoproteinase Arg-C digested p53 in TSO buffer. After 2 hours, 0.5ug of anti-M13 
(Pharmacia) in TSO was added and incubated for 1 hour following which 5ul of protein 
A/G agarose (Sigma) was added and the mixture incubated for a further 0.5 hours with 
swirling. The agarose beads were then pelleted, washed as in example 1 above and 
analysed by mass spectrometry. 

The results showed the same major peak as in example 1 corresponding to the 65 amino 
acid N terminal Arg-C endoprotease fragment. 

Example 3 

In this example, a gene fragment encoding a test protein was subjected to priming with a 
synthetic oligonucleotide encoding a polyhistidine tag. The cDNAs were expressed by in 
vitro transcription and translation (IVTT) and the tagged peptide fragments were then 
isolated using a nickel chelate column. These fragments were then used to isolate a single- 
chain Fv antibody fragment. Subsequently, the single-chain Fv was used to isolate a 
peptide fragment from a protease digest of the test protein as confirmed by mass 
spectrometry. 

Example 4 

The method of example 2 was repeated using a total protein preparation from cells and the 
chemically tagged peptides were used to isolate a collection of single-chain Fv antibody 
fragments. Subsequently, a mixture of twelve of these single-chain Fv's was used to 
isolate peptide fragments from a protease digest of the test protein and analysed by mass 
spectrometry. 